Dataset statistics
| Number of variables | 25 |
|---|---|
| Number of observations | 4274187 |
| Missing cells | 40003182 |
| Missing cells (%) | 37.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 815.2 MiB |
| Average record size in memory | 200.0 B |
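
The derived figures in this table can be cross-checked from the raw counts. The sketch below assumes the missing-cell percentage is taken over observations × variables and that 1 MiB is 2^20 bytes. The variable names are illustrative and not part of the report.

```python
# Values copied from the Dataset statistics table.
n_variables = 25
n_observations = 4_274_187
missing_cells = 40_003_182
total_size_mib = 815.2

# Share of empty cells over the full observations x variables grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results agree with the table after rounding to one decimal place.
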
Variable types
| Numeric | 4 |
|---|---|
| DateTime | 2 |
| Text | 9 |
| Categorical | 10 |
Alerts
| DRIVER_LICENSE_STATUS is highly imbalanced (86.2%) | Imbalance |
|---|---|
| VEHICLE_DAMAGE_3 is highly imbalanced (53.7%) | Imbalance |
| PUBLIC_PROPERTY_DAMAGE is highly imbalanced (63.2%) | Imbalance |
| STATE_REGISTRATION has 321735 (7.5%) missing values | Missing |
| VEHICLE_TYPE has 247609 (5.8%) missing values | Missing |
| VEHICLE_MAKE has 1899905 (44.5%) missing values | Missing |
| VEHICLE_MODEL has 4222807 (98.8%) missing values | Missing |
| VEHICLE_YEAR has 1921158 (44.9%) missing values | Missing |
| TRAVEL_DIRECTION has 1673932 (39.2%) missing values | Missing |
| VEHICLE_OCCUPANTS has 1793977 (42.0%) missing values | Missing |
| DRIVER_SEX has 2252000 (52.7%) missing values | Missing |
| DRIVER_LICENSE_STATUS has 2346168 (54.9%) missing values | Missing |
| DRIVER_LICENSE_JURISDICTION has 2342040 (54.8%) missing values | Missing |
| PRE_CRASH has 928192 (21.7%) missing values | Missing |
| POINT_OF_IMPACT has 1707717 (40.0%) missing values | Missing |
| VEHICLE_DAMAGE has 1733402 (40.6%) missing values | Missing |
| VEHICLE_DAMAGE_1 has 2633928 (61.6%) missing values | Missing |
| VEHICLE_DAMAGE_2 has 3034451 (71.0%) missing values | Missing |
| VEHICLE_DAMAGE_3 has 3320488 (77.7%) missing values | Missing |
| PUBLIC_PROPERTY_DAMAGE has 1528858 (35.8%) missing values | Missing |
| PUBLIC_PROPERTY_DAMAGE_TYPE has 4246765 (99.4%) missing values | Missing |
| CONTRIBUTING_FACTOR_1 has 153529 (3.6%) missing values | Missing |
| CONTRIBUTING_FACTOR_2 has 1694521 (39.6%) missing values | Missing |
| VEHICLE_YEAR is highly skewed (γ1 = 55.67830854) | Skewed |
| VEHICLE_OCCUPANTS is highly skewed (γ1 = 980.7930925) | Skewed |
| UNIQUE_ID has unique values | Unique |
| VEHICLE_OCCUPANTS has 428660 (10.0%) zeros | Zeros |
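
The missing-value lines above can be used to shortlist columns that are too sparse to analyse. The sketch below uses a subset of the reported counts. The 90% threshold and the helper name `sparse_columns` are hypothetical choices, not something the report prescribes.

```python
# Missing-value counts copied from the lines above (subset shown).
N_ROWS = 4_274_187
missing = {
    "VEHICLE_MODEL": 4_222_807,
    "PUBLIC_PROPERTY_DAMAGE_TYPE": 4_246_765,
    "VEHICLE_DAMAGE_3": 3_320_488,
    "DRIVER_SEX": 2_252_000,
    "VEHICLE_TYPE": 247_609,
    "CONTRIBUTING_FACTOR_1": 153_529,
}


def sparse_columns(missing_counts, n_rows, threshold=0.9):
    """Return columns whose missing share exceeds `threshold`, most sparse first."""
    shares = {col: n / n_rows for col, n in missing_counts.items()}
    flagged = [col for col, share in shares.items() if share > threshold]
    return sorted(flagged, key=lambda col: shares[col], reverse=True)


print(sparse_columns(missing, N_ROWS))
```

With this threshold only PUBLIC_PROPERTY_DAMAGE_TYPE and VEHICLE_MODEL are flagged. A lower threshold would also catch the VEHICLE_DAMAGE_n columns.
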
Reproduction
| Analysis started | 2024-10-29 14:09:26.288214 |
|---|---|
| Analysis finished | 2024-10-29 14:12:17.330572 |
| Duration | 2 minutes and 51.04 seconds |
| Software version | ydata-profiling v0.0.dev0 |
| Download configuration | config.json |
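
The reported duration is the difference between the two timestamps. A minimal check, assuming both are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-10-29 14:09:26.288214", FMT)
finished = datetime.strptime("2024-10-29 14:12:17.330572", FMT)

# Split the elapsed seconds into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```
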
UNIQUE_ID
Real number (ℝ)
UNIQUE 
| Distinct | 4274187 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16643926 |
| Minimum | 111711 |
|---|---|
| Maximum | 20771083 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 32.6 MiB |
Quantile statistics
| Minimum | 111711 |
|---|---|
| 5-th percentile | 9719068.3 |
| Q1 | 14589050 |
| median | 17595763 |
| Q3 | 19526220 |
| 95-th percentile | 20508925 |
| Maximum | 20771083 |
| Range | 20659372 |
| Interquartile range (IQR) | 4937170 |
Descriptive statistics
| Standard deviation | 3367520.5 |
|---|---|
| Coefficient of variation (CV) | 0.20232729 |
| Kurtosis | -0.36060851 |
| Mean | 16643926 |
| Median Absolute Deviation (MAD) | 2485262 |
| Skewness | -0.81830805 |
| Sum | 7.1139251 × 10¹³ |
| Variance | 1.1340194 × 10¹³ |
| Monotonicity | Not monotonic |
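
Several rows in the quantile and descriptive tables are simple functions of the others. The sketch below recomputes them for UNIQUE_ID from the rounded values shown in the report, so small rounding differences are expected. The variable names are illustrative.

```python
# Values copied from the UNIQUE_ID tables above.
minimum, maximum = 111_711, 20_771_083
q1, q3 = 14_589_050, 19_526_220
mean, std = 16_643_926, 3_367_520.5

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance is the squared standard deviation

print(value_range, iqr, round(cv, 8), f"{variance:.7e}")
```
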
| Value | Count | Frequency (%) |
| 10385780 | 1 | < 0.1% |
| 17862678 | 1 | < 0.1% |
| 18545668 | 1 | < 0.1% |
| 17112286 | 1 | < 0.1% |
| 17954158 | 1 | < 0.1% |
| 17500925 | 1 | < 0.1% |
| 17822072 | 1 | < 0.1% |
| 18387894 | 1 | < 0.1% |
| 17470706 | 1 | < 0.1% |
| 17717581 | 1 | < 0.1% |
| Other values (4274177) | 4274177 |
| Value | Count | Frequency (%) |
| 111711 | 1 | |
| 111712 | 1 | |
| 115530 | 1 | |
| 115531 | 1 | |
| 120620 | 1 | |
| 123422 | 1 | |
| 123423 | 1 | |
| 199289 | 1 | |
| 199290 | 1 | |
| 199291 | 1 |
| Value | Count | Frequency (%) |
| 20771083 | 1 | |
| 20771082 | 1 | |
| 20771081 | 1 | |
| 20771080 | 1 | |
| 20771079 | 1 | |
| 20771078 | 1 | |
| 20771077 | 1 | |
| 20771072 | 1 | |
| 20771071 | 1 | |
| 20771070 | 1 |
COLLISION_ID
Real number (ℝ)
| Distinct | 2127445 |
|---|---|
| Distinct (%) | 49.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3213661.5 |
| Minimum | 22 |
|---|---|
| Maximum | 4766163 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 32.6 MiB |
Quantile statistics
| Minimum | 22 |
|---|---|
| 5-th percentile | 110645 |
| Q1 | 3174562.5 |
| median | 3709203 |
| Q3 | 4239995 |
| 95-th percentile | 4659439 |
| Maximum | 4766163 |
| Range | 4766141 |
| Interquartile range (IQR) | 1065432.5 |
Descriptive statistics
| Standard deviation | 1498334.6 |
|---|---|
| Coefficient of variation (CV) | 0.46623909 |
| Kurtosis | 0.10049334 |
| Mean | 3213661.5 |
| Median Absolute Deviation (MAD) | 532713 |
| Skewness | -1.2597238 |
| Sum | 1.373579 × 10¹³ |
| Variance | 2.2450066 × 10¹² |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4691158 | 42 | < 0.1% |
| 4539133 | 40 | < 0.1% |
| 4275782 | 25 | < 0.1% |
| 4324675 | 22 | < 0.1% |
| 3925685 | 22 | < 0.1% |
| 4541337 | 22 | < 0.1% |
| 306480 | 21 | < 0.1% |
| 4625450 | 20 | < 0.1% |
| 4578189 | 19 | < 0.1% |
| 3187017 | 19 | < 0.1% |
| Other values (2127435) | 4273935 |
| Value | Count | Frequency (%) |
| 22 | 2 | |
| 23 | 2 | |
| 24 | 2 | |
| 25 | 2 | |
| 26 | 2 | |
| 27 | 2 | |
| 28 | 2 | |
| 29 | 2 | |
| 30 | 2 | |
| 31 | 2 |
| Value | Count | Frequency (%) |
| 4766163 | 2 | < 0.1% |
| 4766160 | 2 | < 0.1% |
| 4766157 | 2 | < 0.1% |
| 4766156 | 2 | < 0.1% |
| 4766155 | 2 | < 0.1% |
| 4766154 | 2 | < 0.1% |
| 4766152 | 2 | < 0.1% |
| 4766151 | 1 | < 0.1% |
| 4766150 | 1 | < 0.1% |
| 4766148 | 6 |
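
COLLISION_ID repeats because several rows can share one collision. Assuming each row describes one vehicle involved in a collision (an interpretation the report does not state), the ratio of rows to distinct IDs gives the average number of vehicles per collision. The small `sample_ids` list reuses a few values and counts from the tables above and is only illustrative.

```python
from collections import Counter

n_rows = 4_274_187
n_collisions = 2_127_445  # Distinct COLLISION_ID values

rows_per_collision = n_rows / n_collisions
print(f"{rows_per_collision:.2f} rows per collision on average")

# Counting rows per ID on a tiny illustrative sample (not the real column).
sample_ids = [22, 22, 23, 23, 4766151]
vehicles_per_collision = Counter(sample_ids)
print(vehicles_per_collision.most_common(2))
```
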
CRASH_DATE
Date
| Distinct | 4497 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 32.6 MiB |
| Minimum | 2012-07-01 00:00:00 |
|---|---|
| Maximum | 2024-10-22 00:00:00 |
CRASH_TIME
Date
| Distinct | 1440 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 32.6 MiB |
| Minimum | 2024-10-29 00:00:00 |
|---|---|
| Maximum | 2024-10-29 23:59:00 |
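
Every CRASH_TIME value falls on 2024-10-29, the day the analysis ran, and there are 1440 distinct values, one per minute of a day. This suggests that time-of-day strings were parsed with a default date, although the report does not confirm it. A sketch of merging the two columns into one timestamp follows. The raw formats `YYYY-MM-DD` and `HH:MM` and the function name are assumptions.

```python
from datetime import date, datetime

MINUTES_PER_DAY = 24 * 60  # matches the 1440 distinct CRASH_TIME values


def combine_crash_timestamp(crash_date: str, crash_time: str) -> datetime:
    """Merge a date string and a time-of-day string into one timestamp.

    Assumed (hypothetical) raw formats: 'YYYY-MM-DD' and 'HH:MM'.
    """
    d = date.fromisoformat(crash_date)
    t = datetime.strptime(crash_time, "%H:%M").time()
    return datetime.combine(d, t)


print(combine_crash_timestamp("2012-07-01", "23:59"))
```
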
VEHICLE_ID
Text
| Distinct | 2745362 |
|---|---|
| Distinct (%) | 64.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 32.6 MiB |
Length
| Max length | 36 |
|---|---|
| Median length | 36 |
| Mean length | 20.863062 |
| Min length | 1 |
Characters and Unicode
| Total characters | 89172629 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2745343 |
|---|---|
| Unique (%) | 64.2% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0553ab4d-9500-4cba-8d98-f4d7f89d5856 |
| 3rd row | 2 |
| 4th row | 1 |
| 5th row | 1 |
| Value | Count | Frequency (%) |
| 1 | 769061 | 18.0% |
| 2 | 694883 | 16.3% |
| 3 | 50530 | 1.2% |
| 4 | 10398 | 0.2% |
| 5 | 2608 | 0.1% |
| 6 | 791 | < 0.1% |
| 7 | 281 | < 0.1% |
| 8 | 130 | < 0.1% |
| 9 | 69 | < 0.1% |
| 10 | 36 | < 0.1% |
| Other values (2745352) | 2745400 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 9481656 | 10.6% |
| 4 | 7045658 | 7.9% |
| 1 | 5540470 | 6.2% |
| 2 | 5379107 | 6.0% |
| 8 | 5255399 | 5.9% |
| 9 | 5252223 | 5.9% |
| b | 5041156 | 5.7% |
| a | 5033437 | 5.6% |
| 3 | 4717854 | 5.3% |
| 5 | 4667774 | 5.2% |
| Other values (7) | 31757895 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 51835473 | |
| Lowercase Letter | 27855500 | |
| Dash Punctuation | 9481656 | 10.6% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 7045658 | |
| 1 | 5540470 | |
| 2 | 5379107 | |
| 8 | 5255399 | |
| 9 | 5252223 | |
| 3 | 4717854 | |
| 5 | 4667774 | |
| 6 | 4660160 | |
| 0 | 4658578 | |
| 7 | 4658250 |
Lowercase Letter
| Value | Count | Frequency (%) |
| b | 5041156 | |
| a | 5033437 | |
| e | 4448371 | |
| f | 4445150 | |
| c | 4443797 | |
| d | 4443589 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 9481656 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 61317129 | |
| Latin | 27855500 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| - | 9481656 | |
| 4 | 7045658 | |
| 1 | 5540470 | |
| 2 | 5379107 | |
| 8 | 5255399 | |
| 9 | 5252223 | |
| 3 | 4717854 | |
| 5 | 4667774 | |
| 6 | 4660160 | |
| 0 | 4658578 |
Latin
| Value | Count | Frequency (%) |
| b | 5041156 | |
| a | 5033437 | |
| e | 4448371 | |
| f | 4445150 | |
| c | 4443797 | |
| d | 4443589 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 89172629 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 9481656 | 10.6% |
| 4 | 7045658 | 7.9% |
| 1 | 5540470 | 6.2% |
| 2 | 5379107 | 6.0% |
| 8 | 5255399 | 5.9% |
| 9 | 5252223 | 5.9% |
| b | 5041156 | 5.7% |
| a | 5033437 | 5.6% |
| 3 | 4717854 | 5.3% |
| 5 | 4667774 | 5.2% |
| Other values (7) | 31757895 |
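
The VEHICLE_ID sample mixes two formats: 36-character UUID-style strings and short integers. The sketch below separates them and estimates how many rows are UUID-style from the dash count, assuming every dash belongs to such a value (four dashes each). The function name and the category labels are hypothetical.

```python
import re

UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def classify_vehicle_id(value: str) -> str:
    """Label a VEHICLE_ID as 'uuid', 'sequence' (digits only) or 'other'."""
    if UUID_RE.match(value):
        return "uuid"
    if value.isdigit():
        return "sequence"
    return "other"


# Estimate of UUID-style rows from the character table: 4 dashes per UUID.
n_rows = 4_274_187
dash_count = 9_481_656
uuid_rows = dash_count // 4
uuid_share = 100 * uuid_rows / n_rows

print(classify_vehicle_id("0553ab4d-9500-4cba-8d98-f4d7f89d5856"))
print(classify_vehicle_id("1"))
print(f"about {uuid_rows} UUID-style rows ({uuid_share:.1f}%)")
```

Under that assumption, a little over half of the rows carry a UUID-style identifier. The rest are short numeric values that repeat across collisions.
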
STATE_REGISTRATION
Text
MISSING 
| Distinct | 82 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 321735 |
| Missing (%) | 7.5% |
| Memory size | 32.6 MiB |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 1.9999997 |
| Min length | 1 |
Characters and Unicode
| Total characters | 7904903 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 5 |
|---|---|
| Unique (%) | < 0.1% |
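
The mean length of 1.9999997 for this variable (STATE_REGISTRATION, going by the matching missing count in the alerts) can be explained from the totals. Since the Length table gives a minimum of 1 and a maximum of 2, the shortfall against "two characters per value" counts the one-character entries.

```python
n_rows = 4_274_187
n_missing = 321_735
total_characters = 7_904_903

n_present = n_rows - n_missing
mean_length = total_characters / n_present

# If every present value had 2 characters the total would be 2 * n_present.
short_values = 2 * n_present - total_characters

print(n_present, f"{mean_length:.7f}", short_values)
```

The arithmetic points to a single one-character value among about 3.95 million two-letter codes.
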
Sample
| 1st row | NY |
|---|---|
| 2nd row | NY |
| 3rd row | NY |
| 4th row | NY |
| 5th row | NY |
| Value | Count | Frequency (%) |
| ny | 3287970 | |
| nj | 241929 | 6.1% |
| pa | 89407 | 2.3% |
| fl | 49062 | 1.2% |
| ct | 44851 | 1.1% |
| va | 19812 | 0.5% |
| ma | 18723 | 0.5% |
| md | 18474 | 0.5% |
| nc | 17548 | 0.4% |
| ga | 14760 | 0.4% |
| Other values (72) | 149916 | 3.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 3575709 | |
| Y | 3289221 | |
| J | 241929 | 3.1% |
| A | 165299 | 2.1% |
| P | 90443 | 1.1% |
| C | 78150 | 1.0% |
| T | 67505 | 0.9% |
| L | 63740 | 0.8% |
| M | 50650 | 0.6% |
| F | 49486 | 0.6% |
| Other values (16) | 232771 | 2.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 7904903 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 3575709 | |
| Y | 3289221 | |
| J | 241929 | 3.1% |
| A | 165299 | 2.1% |
| P | 90443 | 1.1% |
| C | 78150 | 1.0% |
| T | 67505 | 0.9% |
| L | 63740 | 0.8% |
| M | 50650 | 0.6% |
| F | 49486 | 0.6% |
| Other values (16) | 232771 | 2.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7904903 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| N | 3575709 | |
| Y | 3289221 | |
| J | 241929 | 3.1% |
| A | 165299 | 2.1% |
| P | 90443 | 1.1% |
| C | 78150 | 1.0% |
| T | 67505 | 0.9% |
| L | 63740 | 0.8% |
| M | 50650 | 0.6% |
| F | 49486 | 0.6% |
| Other values (16) | 232771 | 2.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7904903 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| N | 3575709 | |
| Y | 3289221 | |
| J | 241929 | 3.1% |
| A | 165299 | 2.1% |
| P | 90443 | 1.1% |
| C | 78150 | 1.0% |
| T | 67505 | 0.9% |
| L | 63740 | 0.8% |
| M | 50650 | 0.6% |
| F | 49486 | 0.6% |
| Other values (16) | 232771 | 2.9% |
VEHICLE_TYPE
Text
MISSING 
| Distinct | 2856 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 247609 |
| Missing (%) | 5.8% |
| Memory size | 32.6 MiB |
Length
| Max length | 38 |
|---|---|
| Median length | 30 |
| Mean length | 16.558065 |
| Min length | 1 |
Characters and Unicode
| Total characters | 66672340 |
|---|---|
| Distinct characters | 75 |
| Distinct categories | 10 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1733 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | PASSENGER VEHICLE |
|---|---|
| 2nd row | Station Wagon/Sport Utility Vehicle |
| 3rd row | TAXI |
| 4th row | PASSENGER VEHICLE |
| 5th row | PASSENGER VEHICLE |
| Value | Count | Frequency (%) |
| vehicle | 1652165 | |
| utility | 1199705 | |
| station | 1199628 | |
| sedan | 1159968 | |
| wagon/sport | 861699 | |
| passenger | 770780 | |
|  | 340846 | 3.6% |
| wagon | 338052 | 3.6% |
| sport | 337927 | 3.6% |
| truck | 182549 | 1.9% |
| Other values (1504) | 1332960 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 5349701 | 8.0% |
| S | 5141530 | 7.7% |
| t | 4381445 | 6.6% |
| i | 3715221 | 5.6% |
| E | 3410448 | 5.1% |
| e | 3086621 | 4.6% |
| a | 3065069 | 4.6% |
| n | 2927200 | 4.4% |
| o | 2755960 | 4.1% |
| T | 2170644 | 3.3% |
| Other values (65) | 30668501 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 29908670 | |
| Uppercase Letter | 29852441 | |
| Space Separator | 5349701 | 8.0% |
| Other Punctuation | 1202688 | 1.8% |
| Decimal Number | 134918 | 0.2% |
| Dash Punctuation | 113331 | 0.2% |
| Open Punctuation | 55299 | 0.1% |
| Close Punctuation | 55287 | 0.1% |
| Modifier Symbol | 4 | < 0.1% |
| Control | 1 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 5141530 | |
| E | 3410448 | |
| T | 2170644 | 7.3% |
| I | 1987061 | 6.7% |
| N | 1819684 | 6.1% |
| V | 1797120 | 6.0% |
| A | 1636991 | 5.5% |
| U | 1386891 | 4.6% |
| R | 1361548 | 4.6% |
| W | 1308892 | 4.4% |
| Other values (16) | 7831632 |
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 4381445 | |
| i | 3715221 | |
| e | 3086621 | |
| a | 3065069 | |
| n | 2927200 | |
| o | 2755960 | |
| l | 1797968 | 6.0% |
| d | 1252683 | 4.2% |
| r | 1219100 | 4.1% |
| c | 1173762 | 3.9% |
| Other values (15) | 4533641 |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 100321 | |
| 6 | 28621 | 21.2% |
| 2 | 4896 | 3.6% |
| 3 | 694 | 0.5% |
| 1 | 132 | 0.1% |
| 0 | 112 | 0.1% |
| 5 | 83 | 0.1% |
| 9 | 28 | < 0.1% |
| 8 | 19 | < 0.1% |
| 7 | 12 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 1202629 | |
| . | 29 | < 0.1% |
| # | 10 | < 0.1% |
| , | 6 | < 0.1% |
| ? | 5 | < 0.1% |
| ' | 4 | < 0.1% |
| \ | 3 | < 0.1% |
| & | 2 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5349701 | |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 113331 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 55299 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 55287 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 4 |
Control
| Value | Count | Frequency (%) |
| | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 59761111 | |
| Common | 6911229 | 10.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 5141530 | 8.6% |
| t | 4381445 | 7.3% |
| i | 3715221 | 6.2% |
| E | 3410448 | 5.7% |
| e | 3086621 | 5.2% |
| a | 3065069 | 5.1% |
| n | 2927200 | 4.9% |
| o | 2755960 | 4.6% |
| T | 2170644 | 3.6% |
| I | 1987061 | 3.3% |
| Other values (41) | 27119912 |
Common
| Value | Count | Frequency (%) |
| (space) | 5349701 | |
| / | 1202629 | 17.4% |
| - | 113331 | 1.6% |
| 4 | 100321 | 1.5% |
| ( | 55299 | 0.8% |
| ) | 55287 | 0.8% |
| 6 | 28621 | 0.4% |
| 2 | 4896 | 0.1% |
| 3 | 694 | < 0.1% |
| 1 | 132 | < 0.1% |
| Other values (14) | 318 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 66672340 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 5349701 | 8.0% |
| S | 5141530 | 7.7% |
| t | 4381445 | 6.6% |
| i | 3715221 | 5.6% |
| E | 3410448 | 5.1% |
| e | 3086621 | 4.6% |
| a | 3065069 | 4.6% |
| n | 2927200 | 4.4% |
| o | 2755960 | 4.1% |
| T | 2170644 | 3.3% |
| Other values (65) | 30668501 |
VEHICLE_MAKE
Text
MISSING 
| Distinct | 13416 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 1899905 |
| Missing (%) | 44.5% |
| Memory size | 32.6 MiB |
Length
| Max length | 53 |
|---|---|
| Median length | 13 |
| Mean length | 12.687445 |
| Min length | 1 |
Characters and Unicode
| Total characters | 30123573 |
|---|---|
| Distinct characters | 80 |
| Distinct categories | 12 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 9390 |
|---|---|
| Unique (%) | 0.4% |
Sample
| 1st row | TOYT -CAR/SUV |
|---|---|
| 2nd row | MERZ -CAR/SUV |
| 3rd row | FRHT-TRUCK/BUS |
| 4th row | FORD -CAR/SUV |
| 5th row | VOLK -CAR/SUV |
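
The five sample rows share a pattern: a short make code, a dash, then a body class such as `CAR/SUV` or `TRUCK/BUS`. The sketch below splits such values. The pattern is inferred from these samples only, and `split_make` is a hypothetical helper, so values that do not follow it are returned without a body class.

```python
from typing import Optional, Tuple


def split_make(raw: str) -> Tuple[str, Optional[str]]:
    """Split 'TOYT -CAR/SUV' into ('TOYT', 'CAR/SUV').

    Values without a dash are returned as (value, None).
    """
    make, sep, body = raw.partition("-")
    return make.strip(), (body.strip() if sep else None)


for value in ["TOYT -CAR/SUV", "FRHT-TRUCK/BUS", "BMW"]:
    print(split_make(value))
```
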
| Value | Count | Frequency (%) |
| car/suv | 2134682 | |
| toyt | 405262 | 8.9% |
| hond | 295179 | 6.5% |
| niss | 239360 | 5.3% |
| ford | 206558 | 4.5% |
| chev | 113785 | 2.5% |
| hyun | 84478 | 1.9% |
| bmw | 80804 | 1.8% |
| merz | 78675 | 1.7% |
| jeep | 77851 | 1.7% |
| Other values (7080) | 840575 | 18.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 2901524 | |
| R | 2746430 | |
| U | 2636985 | 8.8% |
| C | 2599265 | 8.6% |
| A | 2381201 | 7.9% |
| V | 2326931 | 7.7% |
| - | 2280689 | 7.6% |
| / | 2269508 | 7.5% |
| (space) | 2182927 | 7.2% |
| O | 1091946 | 3.6% |
| Other values (70) | 6706167 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 23226799 | |
| Dash Punctuation | 2280689 | 7.6% |
| Other Punctuation | 2270323 | 7.5% |
| Space Separator | 2182927 | 7.2% |
| Lowercase Letter | 158301 | 0.5% |
| Decimal Number | 4085 | < 0.1% |
| Open Punctuation | 224 | < 0.1% |
| Close Punctuation | 218 | < 0.1% |
| Math Symbol | 3 | < 0.1% |
| Control | 2 | < 0.1% |
| Other values (2) | 2 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 2901524 | |
| R | 2746430 | |
| U | 2636985 | |
| C | 2599265 | |
| A | 2381201 | |
| V | 2326931 | |
| O | 1091946 | 4.7% |
| T | 1046843 | 4.5% |
| N | 775895 | 3.3% |
| D | 749148 | 3.2% |
| Other values (16) | 3970631 |
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 16867 | 10.7% |
| e | 15502 | 9.8% |
| n | 14399 | 9.1% |
| o | 14363 | 9.1% |
| i | 12105 | 7.6% |
| a | 11003 | 7.0% |
| t | 9440 | 6.0% |
| l | 7275 | 4.6% |
| u | 5933 | 3.7% |
| c | 5677 | 3.6% |
| Other values (16) | 45737 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 2269508 | |
| . | 451 | < 0.1% |
| , | 174 | < 0.1% |
| \ | 64 | < 0.1% |
| # | 34 | < 0.1% |
| & | 32 | < 0.1% |
| ' | 28 | < 0.1% |
| ? | 17 | < 0.1% |
| ; | 11 | < 0.1% |
| : | 4 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 944 | |
| 1 | 623 | |
| 5 | 575 | |
| 9 | 523 | |
| 2 | 394 | |
| 3 | 280 | 6.9% |
| 7 | 245 | 6.0% |
| 4 | 219 | 5.4% |
| 6 | 171 | 4.2% |
| 8 | 111 | 2.7% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2280689 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2182927 | |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 224 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 218 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 3 |
Control
| Value | Count | Frequency (%) |
| | 2 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 1 |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 23385100 | |
| Common | 6738473 | 22.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 2901524 | |
| R | 2746430 | |
| U | 2636985 | |
| C | 2599265 | |
| A | 2381201 | |
| V | 2326931 | |
| O | 1091946 | 4.7% |
| T | 1046843 | 4.5% |
| N | 775895 | 3.3% |
| D | 749148 | 3.2% |
| Other values (42) | 4128932 |
Common
| Value | Count | Frequency (%) |
| - | 2280689 | |
| / | 2269508 | |
| (space) | 2182927 | |
| 0 | 944 | < 0.1% |
| 1 | 623 | < 0.1% |
| 5 | 575 | < 0.1% |
| 9 | 523 | < 0.1% |
| . | 451 | < 0.1% |
| 2 | 394 | < 0.1% |
| 3 | 280 | < 0.1% |
| Other values (18) | 1559 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 30123573 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| S | 2901524 | |
| R | 2746430 | |
| U | 2636985 | 8.8% |
| C | 2599265 | 8.6% |
| A | 2381201 | 7.9% |
| V | 2326931 | 7.7% |
| - | 2280689 | 7.6% |
| / | 2269508 | 7.5% |
| (space) | 2182927 | 7.2% |
| O | 1091946 | 3.6% |
| Other values (70) | 6706167 |
VEHICLE_MODEL
Text
MISSING 
| Distinct | 2429 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 4222807 |
| Missing (%) | 98.8% |
| Memory size | 32.6 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 8 |
| Mean length | 7.5591086 |
| Min length | 1 |
Characters and Unicode
| Total characters | 388387 |
|---|---|
| Distinct characters | 73 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1327 |
|---|---|
| Unique (%) | 2.6% |
Sample
| 1st row | TOYT 4RN |
|---|---|
| 2nd row | FORD ZZZ |
| 3rd row | TRUCK TRADE |
| 4th row | DODG CHA |
| 5th row | town and country |
| Value | Count | Frequency (%) |
| zzz | 9213 | 9.7% |
| toyt | 8644 | 9.1% |
| hond | 5999 | 6.3% |
| niss | 5220 | 5.5% |
| ford | 4930 | 5.2% |
| cam | 3092 | 3.3% |
| chev | 2681 | 2.8% |
| acc | 1899 | 2.0% |
| hyun | 1575 | 1.7% |
| alt | 1532 | 1.6% |
| Other values (1769) | 50052 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 43457 | 11.2% |
| Z | 32695 | 8.4% |
| T | 27048 | 7.0% |
| O | 25820 | 6.6% |
| C | 22245 | 5.7% |
| N | 21375 | 5.5% |
| S | 18775 | 4.8% |
| A | 17553 | 4.5% |
| D | 17438 | 4.5% |
| R | 16184 | 4.2% |
| Other values (63) | 145797 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 323041 | |
| Space Separator | 43457 | 11.2% |
| Decimal Number | 12683 | 3.3% |
| Lowercase Letter | 9082 | 2.3% |
| Dash Punctuation | 54 | < 0.1% |
| Other Punctuation | 54 | < 0.1% |
| Open Punctuation | 8 | < 0.1% |
| Close Punctuation | 7 | < 0.1% |
| Modifier Symbol | 1 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| Z | 32695 | 10.1% |
| T | 27048 | 8.4% |
| O | 25820 | 8.0% |
| C | 22245 | 6.9% |
| N | 21375 | 6.6% |
| S | 18775 | 5.8% |
| A | 17553 | 5.4% |
| D | 17438 | 5.4% |
| R | 16184 | 5.0% |
| H | 14460 | 4.5% |
| Other values (16) | 109448 |
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 995 | 11.0% |
| t | 765 | 8.4% |
| a | 748 | 8.2% |
| u | 723 | 8.0% |
| s | 649 | 7.1% |
| e | 628 | 6.9% |
| r | 600 | 6.6% |
| o | 525 | 5.8% |
| c | 458 | 5.0% |
| k | 447 | 4.9% |
| Other values (16) | 2544 |
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 3067 | |
| 5 | 3005 | |
| 0 | 2394 | |
| 2 | 1344 | |
| 4 | 855 | 6.7% |
| 1 | 528 | 4.2% |
| 8 | 447 | 3.5% |
| 7 | 398 | 3.1% |
| 6 | 391 | 3.1% |
| 9 | 254 | 2.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 25 | |
| . | 15 | |
| , | 6 | 11.1% |
| ? | 5 | 9.3% |
| ' | 2 | 3.7% |
| \ | 1 | 1.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 43457 | |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 54 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 8 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 7 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 332123 | |
| Common | 56264 | 14.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| Z | 32695 | 9.8% |
| T | 27048 | 8.1% |
| O | 25820 | 7.8% |
| C | 22245 | 6.7% |
| N | 21375 | 6.4% |
| S | 18775 | 5.7% |
| A | 17553 | 5.3% |
| D | 17438 | 5.3% |
| R | 16184 | 4.9% |
| H | 14460 | 4.4% |
| Other values (42) | 118530 |
Common
| Value | Count | Frequency (%) |
| (space) | 43457 | |
| 3 | 3067 | 5.5% |
| 5 | 3005 | 5.3% |
| 0 | 2394 | 4.3% |
| 2 | 1344 | 2.4% |
| 4 | 855 | 1.5% |
| 1 | 528 | 0.9% |
| 8 | 447 | 0.8% |
| 7 | 398 | 0.7% |
| 6 | 391 | 0.7% |
| Other values (11) | 378 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 388387 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 43457 | 11.2% |
| Z | 32695 | 8.4% |
| T | 27048 | 7.0% |
| O | 25820 | 6.6% |
| C | 22245 | 5.7% |
| N | 21375 | 5.5% |
| S | 18775 | 4.8% |
| A | 17553 | 4.5% |
| D | 17438 | 4.5% |
| R | 16184 | 4.2% |
| Other values (63) | 145797 |
VEHICLE_YEAR
Real number (ℝ)
MISSING  SKEWED 
| Distinct | 321 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1921158 |
| Missing (%) | 44.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2015.2373 |
| Minimum | 1000 |
|---|---|
| Maximum | 20063 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 32.6 MiB |
Quantile statistics
| Minimum | 1000 |
|---|---|
| 5-th percentile | 2001 |
| Q1 | 2008 |
| median | 2014 |
| Q3 | 2017 |
| 95-th percentile | 2020 |
| Maximum | 20063 |
| Range | 19063 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 147.37309 |
|---|---|
| Coefficient of variation (CV) | 0.073129398 |
| Kurtosis | 3310.8364 |
| Mean | 2015.2373 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 55.678309 |
| Sum | 4.7419118 × 10⁹ |
| Variance | 21718.827 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2016 | 224770 | 5.3% |
| 2015 | 222458 | 5.2% |
| 2017 | 204686 | 4.8% |
| 2014 | 164096 | 3.8% |
| 2018 | 141693 | 3.3% |
| 2013 | 140432 | 3.3% |
| 2012 | 112251 | 2.6% |
| 2019 | 101743 | 2.4% |
| 2011 | 100797 | 2.4% |
| 2007 | 91769 | 2.1% |
| Other values (311) | 848334 | |
| (Missing) | 1921158 |
| Value | Count | Frequency (%) |
| 1000 | 1 | < 0.1% |
| 1111 | 2 | < 0.1% |
| 1900 | 7 | |
| 1920 | 2 | < 0.1% |
| 1921 | 1 | < 0.1% |
| 1923 | 1 | < 0.1% |
| 1926 | 1 | < 0.1% |
| 1930 | 1 | < 0.1% |
| 1931 | 1 | < 0.1% |
| 1932 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 20063 | 1 | < 0.1% |
| 20015 | 2 | < 0.1% |
| 20009 | 1 | < 0.1% |
| 20003 | 1 | < 0.1% |
| 19969 | 1 | < 0.1% |
| 9999 | 741 | |
| 9972 | 1 | < 0.1% |
| 9699 | 1 | < 0.1% |
| 9019 | 1 | < 0.1% |
| 8888 | 1 | < 0.1% |
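
The extreme values above (1000, 1111, 9999 with 741 rows, 20063) are not plausible model years, and they drive the high skewness and kurtosis. One way to handle them is to keep only years inside a plausible window. In this sketch the lower bound of 1900 and the "crash year plus one" upper bound are hypothetical choices, and `clean_vehicle_year` is an illustrative helper.

```python
from typing import Optional


def clean_vehicle_year(
    year: Optional[float], crash_year: int, earliest: int = 1900
) -> Optional[int]:
    """Return the year if it lies in [earliest, crash_year + 1], else None."""
    if year is None or year != year:  # None or NaN
        return None
    y = int(year)
    if earliest <= y <= crash_year + 1:
        return y
    return None


for raw in [2016, 9999, 20063, 1000, float("nan")]:
    print(raw, "->", clean_vehicle_year(raw, crash_year=2018))
```

The value 1900 itself appears 7 times and may also be a placeholder, so a stricter lower bound could be worth testing.
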
TRAVEL_DIRECTION
Categorical
MISSING 
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1673932 |
| Missing (%) | 39.2% |
| Memory size | 32.6 MiB |
| West | |
|---|---|
| North | |
| East | |
| South | |
| Unknown | |
| Other values (10) |
Length
| Max length | 9 |
|---|---|
| Median length | 7 |
| Mean length | 4.8151939 |
| Min length | 1 |
Characters and Unicode
| Total characters | 12520732 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | North |
|---|---|
| 2nd row | East |
| 3rd row | East |
| 4th row | Southwest |
| 5th row | South |
Common Values
| Value | Count | Frequency (%) |
| West | 596799 | 14.0% |
| North | 595515 | 13.9% |
| East | 595036 | 13.9% |
| South | 588175 | 13.8% |
| Unknown | 86834 | 2.0% |
| Northeast | 36270 | 0.8% |
| Southeast | 34395 | 0.8% |
| Southwest | 33640 | 0.8% |
| Northwest | 31846 | 0.7% |
| - | 1003 | < 0.1% |
| Other values (5) | 742 | < 0.1% |
| (Missing) | 1673932 |
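
Besides true missing values, the table contains placeholder categories such as `Unknown` and `-`. If those are treated as missing too (an assumption about their meaning, not something the report states), the effective missing share rises slightly:

```python
n_rows = 4_274_187
missing = 1_673_932

# Placeholder categories and their counts from the Common Values table.
placeholders = {"Unknown": 86_834, "-": 1_003}

effective_missing = missing + sum(placeholders.values())
effective_pct = 100 * effective_missing / n_rows

print(f"{effective_missing} rows ({effective_pct:.1f}%) carry no usable direction")
```
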
Length
| Value | Count | Frequency (%) |
| west | 596799 | |
| north | 595515 | |
| east | 595036 | |
| south | 588175 | |
| unknown | 86834 | 3.3% |
| northeast | 36270 | 1.4% |
| southeast | 34395 | 1.3% |
| southwest | 33640 | 1.3% |
| northwest | 31846 | 1.2% |
|  | 1003 | < 0.1% |
| Other values (5) | 742 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 2647827 | |
| o | 1406675 | |
| s | 1327986 | |
| h | 1319841 | |
| e | 732950 | 5.9% |
| a | 665701 | 5.3% |
| N | 663808 | 5.3% |
| r | 663631 | 5.3% |
| S | 656416 | 5.2% |
| u | 656210 | 5.2% |
| Other values (7) | 1779687 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 9920477 | |
| Uppercase Letter | 2599252 | 20.8% |
| Dash Punctuation | 1003 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 2647827 | |
| o | 1406675 | |
| s | 1327986 | |
| h | 1319841 | |
| e | 732950 | 7.4% |
| a | 665701 | 6.7% |
| r | 663631 | 6.7% |
| u | 656210 | 6.6% |
| n | 260502 | 2.6% |
| w | 152320 | 1.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 663808 | |
| S | 656416 | |
| W | 596972 | |
| E | 595195 | |
| U | 86861 | 3.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1003 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 12519729 | |
| Common | 1003 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 2647827 | |
| o | 1406675 | |
| s | 1327986 | |
| h | 1319841 | |
| e | 732950 | 5.9% |
| a | 665701 | 5.3% |
| N | 663808 | 5.3% |
| r | 663631 | 5.3% |
| S | 656416 | 5.2% |
| u | 656210 | 5.2% |
| Other values (6) | 1778684 |
Common
| Value | Count | Frequency (%) |
| - | 1003 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 12520732 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 2647827 | |
| o | 1406675 | |
| s | 1327986 | |
| h | 1319841 | |
| e | 732950 | 5.9% |
| a | 665701 | 5.3% |
| N | 663808 | 5.3% |
| r | 663631 | 5.3% |
| S | 656416 | 5.2% |
| u | 656210 | 5.2% |
| Other values (7) | 1779687 |
VEHICLE_OCCUPANTS
Real number (ℝ)
MISSING  SKEWED  ZEROS 
| Distinct | 135 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1793977 |
| Missing (%) | 42.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1104.1254 |
| Minimum | 0 |
|---|---|
| Maximum | 1 × 10⁹ |
| Zeros | 428660 |
| Zeros (%) | 10.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 32.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 1 × 10⁹ |
| Range | 1 × 10⁹ |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 944251.22 |
|---|---|
| Coefficient of variation (CV) | 855.2029 |
| Kurtosis | 1001332.9 |
| Mean | 1104.1254 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 980.79309 |
| Sum | 2.7384628 × 10⁹ |
| Variance | 8.9161037 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 1562334 | |
| 0 | 428660 | 10.0% |
| 2 | 328343 | 7.7% |
| 3 | 94491 | 2.2% |
| 4 | 39164 | 0.9% |
| 5 | 14353 | 0.3% |
| 6 | 4748 | 0.1% |
| 7 | 2093 | < 0.1% |
| 8 | 1251 | < 0.1% |
| 9 | 875 | < 0.1% |
| Other values (125) | 3898 | 0.1% |
| (Missing) | 1793977 |
| Value | Count | Frequency (%) |
| 0 | 428660 | 10.0% |
| 1 | 1562334 | |
| 2 | 328343 | 7.7% |
| 3 | 94491 | 2.2% |
| 4 | 39164 | 0.9% |
| 5 | 14353 | 0.3% |
| 6 | 4748 | 0.1% |
| 7 | 2093 | < 0.1% |
| 8 | 1251 | < 0.1% |
| 9 | 875 | < 0.1% |
| Value | Count | Frequency (%) |
| 999999999 | 1 | < 0.1% |
| 981990849 | 1 | < 0.1% |
| 456817715 | 1 | < 0.1% |
| 167820107 | 1 | < 0.1% |
| 99999999 | 1 | < 0.1% |
| 9999999 | 2 | |
| 5292023 | 1 | < 0.1% |
| 999999 | 3 | |
| 99999 | 4 | |
| 24260 | 1 | < 0.1% |
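
The largest values (999999999, 99999999, 9999999 and so on) look like placeholders rather than real occupant counts, which is an assumption. They explain why the mean is 1104 while the median is 1. The toy sample below, which is illustrative and not drawn from the data, shows how one such value moves the mean but leaves the median untouched.

```python
import statistics

# Toy sample: typical occupant counts plus one placeholder-like value.
sample = [0, 1, 1, 1, 1, 2, 3, 999_999_999]

mean = statistics.mean(sample)
median = statistics.median(sample)

print(f"mean = {mean:,.0f}, median = {median}")
```

For this column, robust summaries such as the median, the quartiles, and the MAD are therefore more informative than the mean or standard deviation.
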
DRIVER_SEX
Categorical
MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2252000 |
| Missing (%) | 52.7% |
| Memory size | 32.6 MiB |
| M | |
|---|---|
| F | |
| U | 8469 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2022187 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M |
|---|---|
| 2nd row | M |
| 3rd row | M |
| 4th row | F |
| 5th row | M |
Common Values
| Value | Count | Frequency (%) |
| M | 1496794 | |
| F | 516924 | 12.1% |
| U | 8469 | 0.2% |
| (Missing) | 2252000 |
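
The report shows two different percentages for the same category (for example 12.1% here and 25.6% further down for `F`) because the denominators differ: all rows in this table, non-missing rows in the later ones. A short check with illustrative variable names:

```python
n_rows = 4_274_187
counts = {"M": 1_496_794, "F": 516_924, "U": 8_469}

n_present = sum(counts.values())
n_missing = n_rows - n_present

# Share of all rows versus share of rows with a recorded value.
share_of_all = {k: 100 * v / n_rows for k, v in counts.items()}
share_of_present = {k: 100 * v / n_present for k, v in counts.items()}

for key in counts:
    print(key, f"{share_of_all[key]:.1f}%", f"{share_of_present[key]:.1f}%")
```
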
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| m | 1496794 | |
| f | 516924 | 25.6% |
| u | 8469 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| M | 1496794 | |
| F | 516924 | 25.6% |
| U | 8469 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2022187 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 1496794 | |
| F | 516924 | 25.6% |
| U | 8469 | 0.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2022187 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| M | 1496794 | |
| F | 516924 | 25.6% |
| U | 8469 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2022187 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| M | 1496794 | |
| F | 516924 | 25.6% |
| U | 8469 | 0.4% |
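The DRIVER_SEX tables report two different percentages for the same counts: the Common Values table gives 12.1% for F, while the word and character tables give 25.6%. The difference is the denominator. The sketch below reproduces both figures from the counts in this report, using the 4,274,187 observations from the dataset statistics; the variable names are illustrative.

```python
TOTAL_ROWS = 4_274_187  # number of observations in the dataset
MISSING = 2_252_000     # DRIVER_SEX missing count
counts = {"M": 1_496_794, "F": 516_924, "U": 8_469}

non_missing = TOTAL_ROWS - MISSING


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as in the report tables."""
    return round(100 * part / whole, 1)


# Common Values table: share of all rows, missing rows included.
share_of_all = {k: pct(v, TOTAL_ROWS) for k, v in counts.items()}

# Word and character tables: share of non-missing values only.
share_of_present = {k: pct(v, non_missing) for k, v in counts.items()}
```

When comparing categories across variables with different missing rates, the share of non-missing values is usually the more comparable figure.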
DRIVER_LICENSE_STATUS
Categorical
IMBALANCE  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2346168 |
| Missing (%) | 54.9% |
| Memory size | 32.6 MiB |
| Licensed | 1870844 |
|---|---|
| Unlicensed | 39376 |
| Permit | 17799 |
Length
| Max length | 10 |
|---|---|
| Median length | 8 |
| Mean length | 8.0223826 |
| Min length | 6 |
Characters and Unicode
| Total characters | 15467306 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Licensed |
|---|---|
| 2nd row | Licensed |
| 3rd row | Licensed |
| 4th row | Licensed |
| 5th row | Licensed |
Common Values
| Value | Count | Frequency (%) |
| Licensed | 1870844 | 43.8% |
| Unlicensed | 39376 | 0.9% |
| Permit | 17799 | 0.4% |
| (Missing) | 2346168 | 54.9% |
Words
| Value | Count | Frequency (%) |
| licensed | 1870844 | |
| unlicensed | 39376 | 2.0% |
| permit | 17799 | 0.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 3838239 | |
| n | 1949596 | |
| i | 1928019 | |
| c | 1910220 | |
| s | 1910220 | |
| d | 1910220 | |
| L | 1870844 | |
| U | 39376 | 0.3% |
| l | 39376 | 0.3% |
| P | 17799 | 0.1% |
| Other values (3) | 53397 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 13539287 | |
| Uppercase Letter | 1928019 | 12.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 3838239 | |
| n | 1949596 | |
| i | 1928019 | |
| c | 1910220 | |
| s | 1910220 | |
| d | 1910220 | |
| l | 39376 | 0.3% |
| r | 17799 | 0.1% |
| m | 17799 | 0.1% |
| t | 17799 | 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 1870844 | |
| U | 39376 | 2.0% |
| P | 17799 | 0.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 15467306 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 3838239 | |
| n | 1949596 | |
| i | 1928019 | |
| c | 1910220 | |
| s | 1910220 | |
| d | 1910220 | |
| L | 1870844 | |
| U | 39376 | 0.3% |
| l | 39376 | 0.3% |
| P | 17799 | 0.1% |
| Other values (3) | 53397 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15467306 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 3838239 | |
| n | 1949596 | |
| i | 1928019 | |
| c | 1910220 | |
| s | 1910220 | |
| d | 1910220 | |
| L | 1870844 | |
| U | 39376 | 0.3% |
| l | 39376 | 0.3% |
| P | 17799 | 0.1% |
| Other values (3) | 53397 | 0.3% |
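DRIVER_LICENSE_STATUS carries an IMBALANCE flag, but the report does not state how the imbalance figure is computed. One common definition, assumed here, is one minus the Shannon entropy of the category distribution divided by its maximum possible value, the logarithm of the number of categories. Applied to the three non-missing counts above, this sketch gives roughly 0.862, which agrees with the 86.2% in the report's alert list; `imbalance_score` is an illustrative name.

```python
import math
from typing import Sequence


def imbalance_score(counts: Sequence[int]) -> float:
    """1 - H / log(k): 0 for a uniform distribution, approaching 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


license_status = {"Licensed": 1_870_844, "Unlicensed": 39_376, "Permit": 17_799}
score = imbalance_score(list(license_status.values()))
```

Missing values are excluded from the calculation, so the score describes only the 45.1% of rows where a status was recorded.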
DRIVER_LICENSE_JURISDICTION
Text
MISSING 
| Distinct | 72 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2342040 |
| Missing (%) | 54.8% |
| Memory size | 32.6 MiB |
Length
| Max length | 8 |
|---|---|
| Median length | 2 |
| Mean length | 2.0027783 |
| Min length | 2 |
Characters and Unicode
| Total characters | 3869662 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 6 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | NY |
|---|---|
| 2nd row | FL |
| 3rd row | NY |
| 4th row | NY |
| 5th row | NY |
| Value | Count | Frequency (%) |
| ny | 1664726 | |
| nj | 109625 | 5.7% |
| pa | 34490 | 1.8% |
| ct | 21443 | 1.1% |
| fl | 20525 | 1.1% |
| md | 10981 | 0.6% |
| nc | 6845 | 0.4% |
| ma | 6459 | 0.3% |
| ga | 6446 | 0.3% |
| va | 6352 | 0.3% |
| Other values (61) | 44255 | 2.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 1788277 | |
| Y | 1665131 | |
| J | 109626 | 2.8% |
| A | 64075 | 1.7% |
| C | 36636 | 0.9% |
| P | 35152 | 0.9% |
| T | 26327 | 0.7% |
| L | 23673 | 0.6% |
| M | 22375 | 0.6% |
| F | 20862 | 0.5% |
| Other values (20) | 77528 | 2.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 3869563 | |
| Decimal Number | 96 | < 0.1% |
| Other Punctuation | 2 | < 0.1% |
| Lowercase Letter | 1 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 1788277 | |
| Y | 1665131 | |
| J | 109626 | 2.8% |
| A | 64075 | 1.7% |
| C | 36636 | 0.9% |
| P | 35152 | 0.9% |
| T | 26327 | 0.7% |
| L | 23673 | 0.6% |
| M | 22375 | 0.6% |
| F | 20862 | 0.5% |
| Other values (16) | 77429 | 2.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 1 | |
| ' | 1 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 96 |
Lowercase Letter
| Value | Count | Frequency (%) |
| q | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3869564 | |
| Common | 98 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| N | 1788277 | |
| Y | 1665131 | |
| J | 109626 | 2.8% |
| A | 64075 | 1.7% |
| C | 36636 | 0.9% |
| P | 35152 | 0.9% |
| T | 26327 | 0.7% |
| L | 23673 | 0.6% |
| M | 22375 | 0.6% |
| F | 20862 | 0.5% |
| Other values (17) | 77430 | 2.0% |
Common
| Value | Count | Frequency (%) |
| 1 | 96 | |
| , | 1 | 1.0% |
| ' | 1 | 1.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3869662 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| N | 1788277 | |
| Y | 1665131 | |
| J | 109626 | 2.8% |
| A | 64075 | 1.7% |
| C | 36636 | 0.9% |
| P | 35152 | 0.9% |
| T | 26327 | 0.7% |
| L | 23673 | 0.6% |
| M | 22375 | 0.6% |
| F | 20862 | 0.5% |
| Other values (20) | 77528 | 2.0% |
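Nearly every DRIVER_LICENSE_JURISDICTION value is a two-letter uppercase code, but the tables above also show a maximum length of 8, 96 occurrences of the digit 1, a comma, an apostrophe, and one lowercase letter. A minimal sketch for flagging such irregular entries follows. The sample values are hypothetical, since the report does not list the offending strings, and the check only tests the shape of a value, not whether it is a real jurisdiction code.

```python
import re

# Exactly two uppercase ASCII letters, e.g. "NY" or "FL".
TWO_LETTER_CODE = re.compile(r"[A-Z]{2}")


def is_two_letter_code(value: str) -> bool:
    """True when the whole value consists of two uppercase letters."""
    return TWO_LETTER_CODE.fullmatch(value) is not None


# Hypothetical examples of clean and irregular entries.
sample = ["NY", "FL", "N1", "nq", "N,J", "UNKNOWN1"]
irregular = [v for v in sample if not is_two_letter_code(v)]
```

Flagged values can then be reviewed by hand or mapped to a missing marker, which is feasible here because the irregular entries are a tiny fraction of the column.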
PRE_CRASH
Categorical
MISSING 
| Distinct | 19 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 928192 |
| Missing (%) | 21.7% |
| Memory size | 32.6 MiB |
| Going Straight Ahead | 1640811 |
|---|---|
| Parked | 578642 |
| Making Left Turn | 206267 |
| Making Right Turn | 169138 |
| Stopped in Traffic | 152990 |
| Other values (14) | 598147 |
Length
| Max length | 26 |
|---|---|
| Median length | 24 |
| Mean length | 15.949931 |
| Min length | 6 |
Characters and Unicode
| Total characters | 53368388 |
|---|---|
| Distinct characters | 38 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Going Straight Ahead |
|---|---|
| 2nd row | Going Straight Ahead |
| 3rd row | Parked |
| 4th row | Merging |
| 5th row | Parked |
Common Values
| Value | Count | Frequency (%) |
| Going Straight Ahead | 1640811 | 38.4% |
| Parked | 578642 | 13.5% |
| Making Left Turn | 206267 | 4.8% |
| Making Right Turn | 169138 | 4.0% |
| Stopped in Traffic | 152990 | 3.6% |
| Slowing or Stopping | 117145 | 2.7% |
| Backing | 113157 | 2.6% |
| Changing Lanes | 97298 | 2.3% |
| Merging | 54183 | 1.3% |
| Starting from Parking | 54176 | 1.3% |
| Other values (9) | 162188 | 3.8% |
| (Missing) | 928192 | 21.7% |
Words
| Value | Count | Frequency (%) |
| going | 1640811 | |
| straight | 1640811 | |
| ahead | 1640811 | |
| parked | 620199 | 7.5% |
| making | 406847 | 4.9% |
| turn | 406847 | 4.9% |
| left | 207287 | 2.5% |
| right | 170040 | 2.0% |
| in | 169825 | 2.0% |
| traffic | 166185 | 2.0% |
| Other values (23) | 1243476 |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 4985275 | 9.3% |
| (space) | 4967144 | 9.3% |
| a | 4946483 | 9.3% |
| g | 4710712 | 8.8% |
| t | 4187884 | 7.8% |
| n | 3604684 | 6.8% |
| h | 3584624 | 6.7% |
| r | 3259954 | 6.1% |
| e | 2857191 | 5.4% |
| d | 2423202 | 4.5% |
| Other values (28) | 13841235 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 40395509 | |
| Uppercase Letter | 7970071 | 14.9% |
| Space Separator | 4967144 | 9.3% |
| Other Punctuation | 35664 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 4985275 | |
| a | 4946483 | |
| g | 4710712 | |
| t | 4187884 | |
| n | 3604684 | |
| h | 3584624 | |
| r | 3259954 | |
| e | 2857191 | |
| d | 2423202 | |
| o | 2293368 | |
| Other values (13) | 3542132 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 2095462 | |
| A | 1644451 | |
| G | 1640811 | |
| P | 754262 | 9.5% |
| T | 573032 | 7.2% |
| M | 461030 | 5.8% |
| L | 304585 | 3.8% |
| R | 175602 | 2.2% |
| B | 113157 | 1.4% |
| C | 97298 | 1.2% |
| Other values (3) | 110381 | 1.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 4967144 |
Other Punctuation
| Value | Count | Frequency (%) |
| * | 35664 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 48365580 | |
| Common | 5002808 | 9.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 4985275 | |
| a | 4946483 | |
| g | 4710712 | |
| t | 4187884 | 8.7% |
| n | 3604684 | 7.5% |
| h | 3584624 | 7.4% |
| r | 3259954 | 6.7% |
| e | 2857191 | 5.9% |
| d | 2423202 | 5.0% |
| o | 2293368 | 4.7% |
| Other values (26) | 11512203 |
Common
| Value | Count | Frequency (%) |
| (space) | 4967144 | |
| * | 35664 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 53368388 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 4985275 | 9.3% |
| (space) | 4967144 | 9.3% |
| a | 4946483 | 9.3% |
| g | 4710712 | 8.8% |
| t | 4187884 | 7.8% |
| n | 3604684 | 6.8% |
| h | 3584624 | 6.7% |
| r | 3259954 | 6.1% |
| e | 2857191 | 5.4% |
| d | 2423202 | 4.5% |
| Other values (28) | 13841235 |
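The PRE_CRASH word table counts "parked" 620199 times, although the category "Parked" occurs only 578642 times. This is expected if the word table lowercases each label and splits it on whitespace, so that a word shared by several categories is summed across them. That tokenization is an assumption here, as the report does not describe it. The sketch below applies it to the ten categories listed in Common Values.

```python
from collections import Counter

# Top ten PRE_CRASH categories and their counts from the Common Values table.
category_counts = {
    "Going Straight Ahead": 1_640_811,
    "Parked": 578_642,
    "Making Left Turn": 206_267,
    "Making Right Turn": 169_138,
    "Stopped in Traffic": 152_990,
    "Slowing or Stopping": 117_145,
    "Backing": 113_157,
    "Changing Lanes": 97_298,
    "Merging": 54_183,
    "Starting from Parking": 54_176,
}

# Each category contributes its row count to every word in its label.
word_counts = Counter()
for label, n in category_counts.items():
    for word in label.lower().split():
        word_counts[word] += n
```

Because the nine categories grouped under "Other values" are not itemized, the totals from this sketch are lower bounds: "turn" sums to 375405 here against 406847 in the report, and the remaining "parked" occurrences presumably come from those unlisted categories.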
POINT_OF_IMPACT
Categorical
MISSING 
| Distinct | 19 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1707717 |
| Missing (%) | 40.0% |
| Memory size | 32.6 MiB |
| Center Front End | 447752 |
|---|---|
| Left Front Bumper | 325169 |
| Center Back End | 309316 |
| Right Front Bumper | 286673 |
| Right Front Quarter Panel | 182339 |
| Other values (14) | 1015221 |
Length
| Max length | 25 |
|---|---|
| Median length | 23 |
| Mean length | 17.759805 |
| Min length | 4 |
Characters and Unicode
| Total characters | 45580007 |
|---|---|
| Distinct characters | 34 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Left Front Bumper |
|---|---|
| 2nd row | Right Front Bumper |
| 3rd row | Left Front Quarter Panel |
| 4th row | Center Front End |
| 5th row | Right Rear Bumper |
Common Values
| Value | Count | Frequency (%) |
| Center Front End | 447752 | 10.5% |
| Left Front Bumper | 325169 | 7.6% |
| Center Back End | 309316 | 7.2% |
| Right Front Bumper | 286673 | 6.7% |
| Right Front Quarter Panel | 182339 | 4.3% |
| Left Front Quarter Panel | 179769 | 4.2% |
| Left Rear Quarter Panel | 146880 | 3.4% |
| Left Side Doors | 135730 | 3.2% |
| Left Rear Bumper | 134669 | 3.2% |
| Right Side Doors | 113249 | 2.6% |
| Other values (9) | 304924 | 7.1% |
| (Missing) | 1707717 | 40.0% |
Words
| Value | Count | Frequency (%) |
| front | 1421702 | |
| left | 922217 | |
| bumper | 835722 | |
| right | 775807 | |
| center | 757068 | |
| end | 757068 | |
| quarter | 613323 | |
| panel | 613323 | |
| rear | 475095 | 5.8% |
| back | 309316 | 3.8% |
| Other values (10) | 665140 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 5579311 | 12.2% |
| e | 5334109 | |
| r | 5023527 | 11.0% |
| t | 4533104 | 9.9% |
| n | 3552695 | 7.8% |
| a | 2130218 | 4.7% |
| o | 1987298 | 4.4% |
| u | 1450923 | 3.2% |
| F | 1421702 | 3.1% |
| R | 1256046 | 2.8% |
| Other values (24) | 13311074 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 31854915 | |
| Uppercase Letter | 8145781 | 17.9% |
| Space Separator | 5579311 | 12.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 5334109 | |
| r | 5023527 | |
| t | 4533104 | |
| n | 3552695 | |
| a | 2130218 | 6.7% |
| o | 1987298 | 6.2% |
| u | 1450923 | 4.6% |
| i | 1032229 | 3.2% |
| d | 1011127 | 3.2% |
| f | 927361 | 2.9% |
| Other values (9) | 4872324 |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 1421702 | |
| R | 1256046 | |
| B | 1145038 | |
| L | 922217 | |
| C | 757068 | |
| E | 757068 | |
| P | 613323 | |
| Q | 613323 | |
| D | 306329 | 3.8% |
| S | 248979 | 3.1% |
| Other values (4) | 104688 | 1.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5579311 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 40000696 | |
| Common | 5579311 | 12.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 5334109 | |
| r | 5023527 | |
| t | 4533104 | 11.3% |
| n | 3552695 | 8.9% |
| a | 2130218 | 5.3% |
| o | 1987298 | 5.0% |
| u | 1450923 | 3.6% |
| F | 1421702 | 3.6% |
| R | 1256046 | 3.1% |
| B | 1145038 | 2.9% |
| Other values (23) | 12166036 |
Common
| Value | Count | Frequency (%) |
| (space) | 5579311 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 45580007 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 5579311 | 12.2% |
| e | 5334109 | |
| r | 5023527 | 11.0% |
| t | 4533104 | 9.9% |
| n | 3552695 | 7.8% |
| a | 2130218 | 4.7% |
| o | 1987298 | 4.4% |
| u | 1450923 | 3.2% |
| F | 1421702 | 3.1% |
| R | 1256046 | 2.8% |
| Other values (24) | 13311074 |
VEHICLE_DAMAGE
Categorical
MISSING 
| Distinct | 19 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1733402 |
| Missing (%) | 40.6% |
| Memory size | 32.6 MiB |
| Center Front End | 398007 |
|---|---|
| Left Front Bumper | 267625 |
| Center Back End | 264288 |
| Right Front Bumper | 245725 |
| No Damage | 241879 |
| Other values (14) | 1123261 |
Length
| Max length | 25 |
|---|---|
| Median length | 23 |
| Mean length | 17.099184 |
| Min length | 4 |
Characters and Unicode
| Total characters | 43445349 |
|---|---|
| Distinct characters | 34 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Left Front Quarter Panel |
|---|---|
| 2nd row | Right Front Bumper |
| 3rd row | Left Front Quarter Panel |
| 4th row | Center Front End |
| 5th row | Right Rear Bumper |
Common Values
| Value | Count | Frequency (%) |
| Center Front End | 398007 | 9.3% |
| Left Front Bumper | 267625 | 6.3% |
| Center Back End | 264288 | 6.2% |
| Right Front Bumper | 245725 | 5.7% |
| No Damage | 241879 | 5.7% |
| Left Front Quarter Panel | 176340 | 4.1% |
| Right Front Quarter Panel | 171346 | 4.0% |
| Left Rear Quarter Panel | 140224 | 3.3% |
| Left Side Doors | 139861 | 3.3% |
| Left Rear Bumper | 128920 | 3.0% |
| Other values (9) | 366570 | 8.6% |
| (Missing) | 1733402 | 40.6% |
Words
| Value | Count | Frequency (%) |
| front | 1259043 | |
| left | 852970 | |
| bumper | 731608 | |
| right | 718708 | |
| center | 662295 | |
| end | 662295 | |
| quarter | 583751 | |
| panel | 583751 | |
| rear | 454323 | 5.8% |
| back | 264288 | 3.4% |
| Other values (10) | 1061329 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 5293576 | 12.2% |
| e | 5098545 | 11.7% |
| r | 4596523 | 10.6% |
| t | 4127243 | 9.5% |
| n | 3172571 | 7.3% |
| a | 2377016 | 5.5% |
| o | 2028301 | 4.7% |
| u | 1318286 | 3.0% |
| F | 1259043 | 2.9% |
| R | 1178200 | 2.7% |
| Other values (24) | 12996045 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 30317412 | |
| Uppercase Letter | 7834361 | 18.0% |
| Space Separator | 5293576 | 12.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 5098545 | |
| r | 4596523 | |
| t | 4127243 | |
| n | 3172571 | |
| a | 2377016 | |
| o | 2028301 | 6.7% |
| u | 1318286 | 4.3% |
| i | 984315 | 3.2% |
| m | 977890 | 3.2% |
| g | 962847 | 3.2% |
| Other values (9) | 4673875 |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 1259043 | |
| R | 1178200 | |
| B | 995896 | |
| L | 852970 | |
| C | 662295 | |
| E | 662295 | |
| Q | 583751 | |
| P | 583751 | |
| D | 502601 | 6.4% |
| S | 256319 | 3.3% |
| Other values (4) | 297240 | 3.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5293576 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 38151773 | |
| Common | 5293576 | 12.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 5098545 | |
| r | 4596523 | 12.0% |
| t | 4127243 | 10.8% |
| n | 3172571 | 8.3% |
| a | 2377016 | 6.2% |
| o | 2028301 | 5.3% |
| u | 1318286 | 3.5% |
| F | 1259043 | 3.3% |
| R | 1178200 | 3.1% |
| B | 995896 | 2.6% |
| Other values (23) | 12000149 |
Common
| Value | Count | Frequency (%) |
| (space) | 5293576 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 43445349 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 5293576 | 12.2% |
| e | 5098545 | 11.7% |
| r | 4596523 | 10.6% |
| t | 4127243 | 9.5% |
| n | 3172571 | 7.3% |
| a | 2377016 | 5.5% |
| o | 2028301 | 4.7% |
| u | 1318286 | 3.0% |
| F | 1259043 | 2.9% |
| R | 1178200 | 2.7% |
| Other values (24) | 12996045 |
VEHICLE_DAMAGE_1
Categorical
MISSING 
| Distinct | 19 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2633928 |
| Missing (%) | 61.6% |
| Memory size | 32.6 MiB |
| No Damage | 456189 |
|---|---|
| Left Front Bumper | 163154 |
| Center Front End | 155283 |
| Right Front Bumper | 130616 |
| Left Front Quarter Panel | 103732 |
| Other values (14) | 631285 |
Length
| Max length | 25 |
|---|---|
| Median length | 23 |
| Mean length | 15.69047 |
| Min length | 4 |
Characters and Unicode
| Total characters | 25736435 |
|---|---|
| Distinct characters | 34 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Right Front Quarter Panel |
|---|---|
| 2nd row | No Damage |
| 3rd row | Center Back End |
| 4th row | Left Rear Quarter Panel |
| 5th row | Right Front Quarter Panel |
Common Values
| Value | Count | Frequency (%) |
| No Damage | 456189 | 10.7% |
| Left Front Bumper | 163154 | 3.8% |
| Center Front End | 155283 | 3.6% |
| Right Front Bumper | 130616 | 3.1% |
| Left Front Quarter Panel | 103732 | 2.4% |
| Right Front Quarter Panel | 95061 | 2.2% |
| Left Rear Bumper | 85759 | 2.0% |
| Right Rear Bumper | 79907 | 1.9% |
| Left Rear Quarter Panel | 73990 | 1.7% |
| Left Side Doors | 73733 | 1.7% |
| Other values (9) | 222835 | 5.2% |
| (Missing) | 2633928 | 61.6% |
Words
| Value | Count | Frequency (%) |
| front | 647846 | |
| left | 500368 | |
| bumper | 459436 | |
| no | 456189 | |
| damage | 456189 | |
| right | 427964 | |
| quarter | 330465 | |
| panel | 330465 | |
| rear | 297338 | |
| end | 220874 | 4.7% |
| Other values (10) | 598191 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 3085066 | 12.0% |
| e | 2993938 | 11.6% |
| r | 2461376 | 9.6% |
| t | 2154601 | 8.4% |
| a | 1941534 | 7.5% |
| n | 1423786 | 5.5% |
| o | 1387831 | 5.4% |
| m | 918541 | 3.6% |
| g | 886595 | 3.4% |
| u | 791186 | 3.1% |
| Other values (24) | 7691981 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 17926044 | |
| Uppercase Letter | 4725325 | 18.4% |
| Space Separator | 3085066 | 12.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2993938 | |
| r | 2461376 | |
| t | 2154601 | |
| a | 1941534 | |
| n | 1423786 | |
| o | 1387831 | |
| m | 918541 | 5.1% |
| g | 886595 | 4.9% |
| u | 791186 | 4.4% |
| i | 572166 | 3.2% |
| Other values (9) | 2394490 |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 727311 | |
| F | 647846 | |
| D | 597536 | |
| B | 525027 | |
| L | 500368 | |
| N | 456189 | |
| P | 330465 | |
| Q | 330465 | |
| C | 220874 | 4.7% |
| E | 220874 | 4.7% |
| Other values (4) | 168370 | 3.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 3085066 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 22651369 | |
| Common | 3085066 | 12.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 2993938 | |
| r | 2461376 | 10.9% |
| t | 2154601 | 9.5% |
| a | 1941534 | 8.6% |
| n | 1423786 | 6.3% |
| o | 1387831 | 6.1% |
| m | 918541 | 4.1% |
| g | 886595 | 3.9% |
| u | 791186 | 3.5% |
| R | 727311 | 3.2% |
| Other values (23) | 6964670 |
Common
| Value | Count | Frequency (%) |
| (space) | 3085066 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 25736435 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 3085066 | 12.0% |
| e | 2993938 | 11.6% |
| r | 2461376 | 9.6% |
| t | 2154601 | 8.4% |
| a | 1941534 | 7.5% |
| n | 1423786 | 5.5% |
| o | 1387831 | 5.4% |
| m | 918541 | 3.6% |
| g | 886595 | 3.4% |
| u | 791186 | 3.1% |
| Other values (24) | 7691981 |
VEHICLE_DAMAGE_2
Categorical
MISSING 
| Distinct | 19 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3034451 |
| Missing (%) | 71.0% |
| Memory size | 32.6 MiB |
| No Damage | 586304 |
|---|---|
| Right Front Bumper | 123314 |
| Left Front Bumper | 73650 |
| Center Front End | 62140 |
| Left Rear Bumper | 60839 |
| Other values (14) | 333489 |
Length
| Max length | 25 |
|---|---|
| Median length | 24 |
| Mean length | 13.722874 |
| Min length | 4 |
Characters and Unicode
| Total characters | 17012741 |
|---|---|
| Distinct characters | 34 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | No Damage |
|---|---|
| 2nd row | Left Rear Bumper |
| 3rd row | Right Front Bumper |
| 4th row | No Damage |
| 5th row | No Damage |
Common Values
| Value | Count | Frequency (%) |
| No Damage | 586304 | 13.7% |
| Right Front Bumper | 123314 | 2.9% |
| Left Front Bumper | 73650 | 1.7% |
| Center Front End | 62140 | 1.5% |
| Left Rear Bumper | 60839 | 1.4% |
| Left Front Quarter Panel | 46624 | 1.1% |
| Right Rear Bumper | 43800 | 1.0% |
| Right Front Quarter Panel | 40766 | 1.0% |
| Left Rear Quarter Panel | 39020 | 0.9% |
| Right Rear Quarter Panel | 38483 | 0.9% |
| Other values (9) | 124796 | 2.9% |
| (Missing) | 3034451 | 71.0% |
Words
| Value | Count | Frequency (%) |
| no | 586304 | |
| damage | 586304 | |
| front | 346494 | |
| bumper | 301603 | |
| right | 273192 | |
| left | 251380 | |
| rear | 182142 | 5.6% |
| quarter | 164893 | 5.1% |
| panel | 164893 | 5.1% |
| end | 95203 | 2.9% |
| Other values (10) | 278075 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1990747 | 11.7% |
| e | 1936980 | |
| a | 1722268 | 10.1% |
| r | 1349248 | 7.9% |
| t | 1159145 | 6.8% |
| o | 1053551 | 6.2% |
| m | 889792 | 5.2% |
| g | 861734 | 5.1% |
| n | 704892 | 4.1% |
| D | 646265 | 3.8% |
| Other values (24) | 4698119 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11791511 | |
| Uppercase Letter | 3230483 | 19.0% |
| Space Separator | 1990747 | 11.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1936980 | |
| a | 1722268 | |
| r | 1349248 | |
| t | 1159145 | |
| o | 1053551 | |
| m | 889792 | |
| g | 861734 | |
| n | 704892 | 6.0% |
| u | 467357 | 4.0% |
| i | 335584 | 2.8% |
| Other values (9) | 1310960 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 646265 | |
| N | 586304 | |
| R | 456692 | |
| F | 346494 | |
| B | 334666 | |
| L | 251380 | 7.8% |
| Q | 164893 | 5.1% |
| P | 164893 | 5.1% |
| C | 95203 | 2.9% |
| E | 95203 | 2.9% |
| Other values (4) | 88490 | 2.7% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1990747 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 15021994 | |
| Common | 1990747 | 11.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1936980 | |
| a | 1722268 | |
| r | 1349248 | 9.0% |
| t | 1159145 | 7.7% |
| o | 1053551 | 7.0% |
| m | 889792 | 5.9% |
| g | 861734 | 5.7% |
| n | 704892 | 4.7% |
| D | 646265 | 4.3% |
| N | 586304 | 3.9% |
| Other values (23) | 4111815 |
Common
| Value | Count | Frequency (%) |
| (space) | 1990747 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 17012741 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 1990747 | 11.7% |
| e | 1936980 | |
| a | 1722268 | 10.1% |
| r | 1349248 | 7.9% |
| t | 1159145 | 6.8% |
| o | 1053551 | 6.2% |
| m | 889792 | 5.2% |
| g | 861734 | 5.1% |
| n | 704892 | 4.1% |
| D | 646265 | 3.8% |
| Other values (24) | 4698119 |
VEHICLE_DAMAGE_3
Categorical
IMBALANCE  MISSING 
| Distinct | 19 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3320488 |
| Missing (%) | 77.7% |
| Memory size | 32.6 MiB |
| No Damage | 675522 |
|---|---|
| Center Front End | 33747 |
| Other | 32688 |
| Right Front Bumper | 27348 |
| Left Front Bumper | 26560 |
| Other values (14) | 157834 |
Length
| Max length | 25 |
|---|---|
| Median length | 9 |
| Mean length | 11.337945 |
| Min length | 4 |
Characters and Unicode
| Total characters | 10812987 |
|---|---|
| Distinct characters | 34 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | No Damage |
|---|---|
| 2nd row | No Damage |
| 3rd row | No Damage |
| 4th row | No Damage |
| 5th row | No Damage |
Common Values
| Value | Count | Frequency (%) |
| No Damage | 675522 | 15.8% |
| Center Front End | 33747 | 0.8% |
| Other | 32688 | 0.8% |
| Right Front Bumper | 27348 | 0.6% |
| Left Front Bumper | 26560 | 0.6% |
| Left Front Quarter Panel | 25385 | 0.6% |
| Right Front Quarter Panel | 22265 | 0.5% |
| Center Back End | 18604 | 0.4% |
| Left Rear Quarter Panel | 16243 | 0.4% |
| Left Rear Bumper | 15872 | 0.4% |
| Other values (9) | 59465 | 1.4% |
| (Missing) | 3320488 | 77.7% |
Words
| Value | Count | Frequency (%) |
| no | 675522 | |
| damage | 675522 | |
| front | 135305 | 6.2% |
| left | 96841 | 4.4% |
| right | 87386 | 4.0% |
| bumper | 82125 | 3.8% |
| quarter | 78128 | 3.6% |
| panel | 78128 | 3.6% |
| rear | 58695 | 2.7% |
| end | 52351 | 2.4% |
| Other values (10) | 160502 | 7.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 1592142 | |
| e | 1245686 | |
| (space) | 1226806 | 11.3% |
| o | 864223 | |
| g | 766590 | 7.1% |
| m | 760335 | 7.0% |
| D | 702184 | 6.5% |
| N | 675522 | 6.2% |
| r | 554762 | 5.1% |
| t | 483681 | 4.5% |
| Other values (24) | 1941056 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 7405676 | |
| Uppercase Letter | 2180505 | 20.2% |
| Space Separator | 1226806 | 11.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 1592142 | |
| e | 1245686 | |
| o | 864223 | |
| g | 766590 | |
| m | 760335 | |
| r | 554762 | 7.5% |
| t | 483681 | 6.5% |
| n | 322799 | 4.4% |
| u | 161235 | 2.2% |
| h | 122762 | 1.7% |
| Other values (9) | 531461 | 7.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 702184 | |
| N | 675522 | |
| R | 147461 | 6.8% |
| F | 135305 | 6.2% |
| B | 100729 | 4.6% |
| L | 96841 | 4.4% |
| Q | 78128 | 3.6% |
| P | 78128 | 3.6% |
| E | 52351 | 2.4% |
| C | 52351 | 2.4% |
| Other values (4) | 61505 | 2.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1226806 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9586181 | |
| Common | 1226806 | 11.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 1592142 | |
| e | 1245686 | |
| o | 864223 | |
| g | 766590 | |
| m | 760335 | |
| D | 702184 | |
| N | 675522 | |
| r | 554762 | 5.8% |
| t | 483681 | 5.0% |
| n | 322799 | 3.4% |
| Other values (23) | 1618257 |
Common
| Value | Count | Frequency (%) |
| (space) | 1226806 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 10812987 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 1592142 | |
| e | 1245686 | |
| (space) | 1226806 | 11.3% |
| o | 864223 | |
| g | 766590 | 7.1% |
| m | 760335 | 7.0% |
| D | 702184 | 6.5% |
| N | 675522 | 6.2% |
| r | 554762 | 5.1% |
| t | 483681 | 4.5% |
| Other values (24) | 1941056 |
PUBLIC_PROPERTY_DAMAGE
Categorical
IMBALANCE  MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1528858 |
| Missing (%) | 35.8% |
| Memory size | 32.6 MiB |
| N | 2397062 |
|---|---|
| Unspecified | 332361 |
| Y | 15906 |
Length
| Max length | 11 |
|---|---|
| Median length | 1 |
| Mean length | 2.2106418 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6068939 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | N |
|---|---|
| 2nd row | N |
| 3rd row | N |
| 4th row | N |
| 5th row | N |
Common Values
| Value | Count | Frequency (%) |
| N | 2397062 | 56.1% |
| Unspecified | 332361 | 7.8% |
| Y | 15906 | 0.4% |
| (Missing) | 1528858 | 35.8% |
Words
| Value | Count | Frequency (%) |
| n | 2397062 | |
| unspecified | 332361 | 12.1% |
| y | 15906 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 2397062 | |
| e | 664722 | 11.0% |
| i | 664722 | 11.0% |
| U | 332361 | 5.5% |
| n | 332361 | 5.5% |
| s | 332361 | 5.5% |
| p | 332361 | 5.5% |
| c | 332361 | 5.5% |
| f | 332361 | 5.5% |
| d | 332361 | 5.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3323610 | |
| Uppercase Letter | 2745329 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 664722 | |
| i | 664722 | |
| n | 332361 | |
| s | 332361 | |
| p | 332361 | |
| c | 332361 | |
| f | 332361 | |
| d | 332361 |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 2397062 | |
| U | 332361 | 12.1% |
| Y | 15906 | 0.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6068939 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| N | 2397062 | |
| e | 664722 | 11.0% |
| i | 664722 | 11.0% |
| U | 332361 | 5.5% |
| n | 332361 | 5.5% |
| s | 332361 | 5.5% |
| p | 332361 | 5.5% |
| c | 332361 | 5.5% |
| f | 332361 | 5.5% |
| d | 332361 | 5.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6068939 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| N | 2397062 | |
| e | 664722 | 11.0% |
| i | 664722 | 11.0% |
| U | 332361 | 5.5% |
| n | 332361 | 5.5% |
| s | 332361 | 5.5% |
| p | 332361 | 5.5% |
| c | 332361 | 5.5% |
| f | 332361 | 5.5% |
| d | 332361 | 5.5% |
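PUBLIC_PROPERTY_DAMAGE stores "Unspecified" as an ordinary category, so it is not included in the 35.8% missing rate. Whether it should be treated as missing is an analyst's decision that the report does not make. Assuming it is, the sketch below recomputes the effective missing rate and the share of Y among rows with a known answer, using the counts above and the 4,274,187 observations from the dataset statistics.

```python
TOTAL_ROWS = 4_274_187  # number of observations in the dataset
missing = 1_528_858     # PUBLIC_PROPERTY_DAMAGE missing count
counts = {"N": 2_397_062, "Unspecified": 332_361, "Y": 15_906}

# Assumption: "Unspecified" carries no information and is folded into missing.
effective_missing = missing + counts["Unspecified"]
effective_missing_pct = round(100 * effective_missing / TOTAL_ROWS, 1)

# Share of "Y" among rows with a definite Y/N answer.
known = counts["N"] + counts["Y"]
yes_share_of_known = round(100 * counts["Y"] / known, 2)
```

Under this assumption the effective missing rate rises from 35.8% to about 43.5%, and Y remains well under 1% of the known answers.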
PUBLIC_PROPERTY_DAMAGE_TYPE
Text
MISSING 
| Distinct | 19998 |
|---|---|
| Distinct (%) | 72.9% |
| Missing | 4246765 |
| Missing (%) | 99.4% |
| Memory size | 32.6 MiB |
Length
| Max length | 866 |
|---|---|
| Median length | 383 |
| Mean length | 38.235468 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1048493 |
|---|---|
| Distinct characters | 61 |
| Distinct categories | 11 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 18975 |
|---|---|
| Unique (%) | 69.2% |
Sample
| 1st row | UTILITY POLE |
|---|---|
| 2nd row | PASSENGER FRONT SIDE DAMAGED |
| 3rd row | FENCE OF A SCHOOL IN THE BACK |
| 4th row | POWERLINE CABLES IN FRONT OF 4236 BEDFORD AVENUE |
| 5th row | BRICK FENCE WAS STRUCK BY MV1 WHEN TRYING TO PARK. |
| Value | Count | Frequency (%) |
| fence | 6459 | 3.6% |
| of | 6198 | 3.4% |
| and | 4957 | 2.7% |
| to | 4591 | 2.5% |
| the | 4114 | 2.3% |
| pole | 3913 | 2.2% |
| damage | 3530 | 2.0% |
| front | 3083 | 1.7% |
| vehicle | 2797 | 1.5% |
| light | 2567 | 1.4% |
| Other values (12108) | 138618 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 153405 | 14.6% |
| E | 102434 | 9.8% |
| A | 71928 | 6.9% |
| T | 68243 | 6.5% |
| O | 64902 | 6.2% |
| N | 64037 | 6.1% |
| I | 57625 | 5.5% |
| R | 55324 | 5.3% |
| D | 44189 | 4.2% |
| L | 41926 | 4.0% |
| Other values (51) | 324480 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 848736 | |
| Space Separator | 153405 | 14.6% |
| Decimal Number | 27887 | 2.7% |
| Other Punctuation | 15234 | 1.5% |
| Dash Punctuation | 1815 | 0.2% |
| Open Punctuation | 651 | 0.1% |
| Close Punctuation | 650 | 0.1% |
| Currency Symbol | 85 | < 0.1% |
| Math Symbol | 17 | < 0.1% |
| Control | 11 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 102434 | |
| A | 71928 | 8.5% |
| T | 68243 | 8.0% |
| O | 64902 | 7.6% |
| N | 64037 | 7.5% |
| I | 57625 | 6.8% |
| R | 55324 | 6.5% |
| D | 44189 | 5.2% |
| L | 41926 | 4.9% |
| S | 41423 | 4.9% |
| Other values (16) | 236705 | 27.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 8262 | 54.2% |
| , | 2878 | 18.9% |
| / | 1968 | 12.9% |
| # | 941 | 6.2% |
| ' | 550 | 3.6% |
| & | 218 | 1.4% |
| : | 138 | 0.9% |
| @ | 115 | 0.8% |
| ? | 73 | 0.5% |
| ; | 67 | 0.4% |
| Other values (2) | 24 | 0.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 6725 | 24.1% |
| 2 | 4150 | 14.9% |
| 0 | 3314 | 11.9% |
| 3 | 2584 | 9.3% |
| 5 | 2248 | 8.1% |
| 4 | 2212 | 7.9% |
| 6 | 1769 | 6.3% |
| 8 | 1681 | 6.0% |
| 7 | 1671 | 6.0% |
| 9 | 1533 | 5.5% |
Math Symbol
| Value | Count | Frequency (%) |
| = | 8 | 47.1% |
| + | 7 | 41.2% |
| ~ | 1 | 5.9% |
| > | 1 | 5.9% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 650 | 99.8% |
| [ | 1 | 0.2% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 649 | 99.8% |
| ] | 1 | 0.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 153405 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1815 | 100.0% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 85 | 100.0% |
Control
| Value | Count | Frequency (%) |
| (control character) | 11 | 100.0% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 2 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 848736 | 80.9% |
| Common | 199757 | 19.1% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| (space) | 153405 | 76.8% |
| . | 8262 | 4.1% |
| 1 | 6725 | 3.4% |
| 2 | 4150 | 2.1% |
| 0 | 3314 | 1.7% |
| , | 2878 | 1.4% |
| 3 | 2584 | 1.3% |
| 5 | 2248 | 1.1% |
| 4 | 2212 | 1.1% |
| / | 1968 | 1.0% |
| Other values (25) | 12011 | 6.0% |
Latin
| Value | Count | Frequency (%) |
| E | 102434 | 12.1% |
| A | 71928 | 8.5% |
| T | 68243 | 8.0% |
| O | 64902 | 7.6% |
| N | 64037 | 7.5% |
| I | 57625 | 6.8% |
| R | 55324 | 6.5% |
| D | 44189 | 5.2% |
| L | 41926 | 4.9% |
| S | 41423 | 4.9% |
| Other values (16) | 236705 | 27.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1048493 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 153405 | 14.6% |
| E | 102434 | 9.8% |
| A | 71928 | 6.9% |
| T | 68243 | 6.5% |
| O | 64902 | 6.2% |
| N | 64037 | 6.1% |
| I | 57625 | 5.5% |
| R | 55324 | 5.3% |
| D | 44189 | 4.2% |
| L | 41926 | 4.0% |
| Other values (51) | 324480 | 30.9% |
CONTRIBUTING_FACTOR_1
MISSING
| Distinct | 61 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 153529 |
| Missing (%) | 3.6% |
| Memory size | 32.6 MiB |
Length
| Max length | 53 |
|---|---|
| Median length | 11 |
| Mean length | 16.335901 |
| Min length | 1 |
Characters and Unicode
| Total characters | 67314662 |
|---|---|
| Distinct characters | 55 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Unspecified |
|---|---|
| 2nd row | Driver Inattention/Distraction |
| 3rd row | Driver Inattention/Distraction |
| 4th row | Unspecified |
| 5th row | Other Vehicular |
| Value | Count | Frequency (%) |
| unspecified | 2417427 | 36.3% |
| driver | 567912 | 8.5% |
| inattention/distraction | 527014 | 7.9% |
| too | 198897 | 3.0% |
| closely | 198897 | 3.0% |
| to | 175570 | 2.6% |
| failure | 152517 | 2.3% |
| yield | 145302 | 2.2% |
| right-of-way | 145302 | 2.2% |
| following | 136420 | 2.0% |
| Other values (96) | 2001668 | 30.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 8755933 | 13.0% |
| e | 8221225 | 12.2% |
| n | 5913284 | 8.8% |
| s | 4151947 | 6.2% |
| t | 3541043 | 5.3% |
| c | 3495358 | 5.2% |
| r | 3022883 | 4.5% |
| o | 2950811 | 4.4% |
| d | 2910010 | 4.3% |
| f | 2856966 | 4.2% |
| Other values (45) | 21495202 | 31.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 56569759 | 84.0% |
| Uppercase Letter | 7231582 | 10.7% |
| Space Separator | 2546268 | 3.8% |
| Other Punctuation | 668620 | 1.0% |
| Dash Punctuation | 293059 | 0.4% |
| Open Punctuation | 2553 | < 0.1% |
| Close Punctuation | 2553 | < 0.1% |
| Decimal Number | 268 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 8755933 | 15.5% |
| e | 8221225 | 14.5% |
| n | 5913284 | 10.5% |
| s | 4151947 | 7.3% |
| t | 3541043 | 6.3% |
| c | 3495358 | 6.2% |
| r | 3022883 | 5.3% |
| o | 2950811 | 5.2% |
| d | 2910010 | 5.1% |
| f | 2856966 | 5.1% |
| Other values (15) | 10750299 | 19.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 2689907 | 37.2% |
| D | 1276856 | 17.7% |
| I | 741471 | 10.3% |
| F | 358478 | 5.0% |
| C | 351725 | 4.9% |
| T | 312161 | 4.3% |
| P | 231120 | 3.2% |
| R | 201358 | 2.8% |
| O | 174903 | 2.4% |
| L | 168002 | 2.3% |
| Other values (12) | 725601 | 10.0% |
Decimal Number
| Value | Count | Frequency (%) |
| 8 | 126 | 47.0% |
| 0 | 126 | 47.0% |
| 1 | 16 | 6.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2546268 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 668620 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 293059 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2553 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2553 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 63801341 | 94.8% |
| Common | 3513321 | 5.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 8755933 | 13.7% |
| e | 8221225 | 12.9% |
| n | 5913284 | 9.3% |
| s | 4151947 | 6.5% |
| t | 3541043 | 5.6% |
| c | 3495358 | 5.5% |
| r | 3022883 | 4.7% |
| o | 2950811 | 4.6% |
| d | 2910010 | 4.6% |
| f | 2856966 | 4.5% |
| Other values (37) | 17981881 | 28.2% |
Common
| Value | Count | Frequency (%) |
| (space) | 2546268 | 72.5% |
| / | 668620 | 19.0% |
| - | 293059 | 8.3% |
| ( | 2553 | 0.1% |
| ) | 2553 | 0.1% |
| 8 | 126 | < 0.1% |
| 0 | 126 | < 0.1% |
| 1 | 16 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 67314662 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 8755933 | 13.0% |
| e | 8221225 | 12.2% |
| n | 5913284 | 8.8% |
| s | 4151947 | 6.2% |
| t | 3541043 | 5.3% |
| c | 3495358 | 5.2% |
| r | 3022883 | 4.5% |
| o | 2950811 | 4.4% |
| d | 2910010 | 4.3% |
| f | 2856966 | 4.2% |
| Other values (45) | 21495202 | 31.9% |
CONTRIBUTING_FACTOR_2
MISSING
| Distinct | 56 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1694521 |
| Missing (%) | 39.6% |
| Memory size | 32.6 MiB |
Length
| Max length | 53 |
|---|---|
| Median length | 11 |
| Mean length | 13.768191 |
| Min length | 1 |
Characters and Unicode
| Total characters | 35517335 |
|---|---|
| Distinct characters | 53 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Unspecified |
|---|---|
| 2nd row | Unsafe Lane Changing |
| 3rd row | Unspecified |
| 4th row | Unspecified |
| 5th row | Unspecified |
| Value | Count | Frequency (%) |
| unspecified | 2019132 | 57.7% |
| driver | 178999 | 5.1% |
| inattention/distraction | 144627 | 4.1% |
| too | 88000 | 2.5% |
| closely | 88000 | 2.5% |
| lane | 63931 | 1.8% |
| passing | 61819 | 1.8% |
| following | 60394 | 1.7% |
| unsafe | 58227 | 1.7% |
| to | 53054 | 1.5% |
| Other values (94) | 682176 | 19.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 5280674 | 14.9% |
| i | 5225177 | 14.7% |
| n | 3155455 | 8.9% |
| s | 2584018 | 7.3% |
| c | 2320563 | 6.5% |
| p | 2216799 | 6.2% |
| d | 2186188 | 6.2% |
| f | 2182232 | 6.1% |
| U | 2133085 | 6.0% |
| r | 979534 | 2.8% |
| Other values (43) | 7253610 | 20.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 30684861 | 86.4% |
| Uppercase Letter | 3638339 | 10.2% |
| Space Separator | 918693 | 2.6% |
| Other Punctuation | 184068 | 0.5% |
| Dash Punctuation | 88576 | 0.2% |
| Open Punctuation | 1380 | < 0.1% |
| Close Punctuation | 1380 | < 0.1% |
| Decimal Number | 38 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 5280674 | 17.2% |
| i | 5225177 | 17.0% |
| n | 3155455 | 10.3% |
| s | 2584018 | 8.4% |
| c | 2320563 | 7.6% |
| p | 2216799 | 7.2% |
| d | 2186188 | 7.1% |
| f | 2182232 | 7.1% |
| r | 979534 | 3.2% |
| o | 973781 | 3.2% |
| Other values (15) | 3580440 | 11.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 2133085 | 58.6% |
| D | 364080 | 10.0% |
| I | 243450 | 6.7% |
| C | 145907 | 4.0% |
| T | 128054 | 3.5% |
| F | 111111 | 3.1% |
| P | 88189 | 2.4% |
| L | 73424 | 2.0% |
| R | 69315 | 1.9% |
| O | 51544 | 1.4% |
| Other values (12) | 230180 | 6.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 918693 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 184068 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 88576 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 1380 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 1380 | 100.0% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 38 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 34323200 | 96.6% |
| Common | 1194135 | 3.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 5280674 | 15.4% |
| i | 5225177 | 15.2% |
| n | 3155455 | 9.2% |
| s | 2584018 | 7.5% |
| c | 2320563 | 6.8% |
| p | 2216799 | 6.5% |
| d | 2186188 | 6.4% |
| f | 2182232 | 6.4% |
| U | 2133085 | 6.2% |
| r | 979534 | 2.9% |
| Other values (37) | 6059475 | 17.7% |
Common
| Value | Count | Frequency (%) |
| (space) | 918693 | 76.9% |
| / | 184068 | 15.4% |
| - | 88576 | 7.4% |
| ( | 1380 | 0.1% |
| ) | 1380 | 0.1% |
| 1 | 38 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 35517335 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 5280674 | 14.9% |
| i | 5225177 | 14.7% |
| n | 3155455 | 8.9% |
| s | 2584018 | 7.3% |
| c | 2320563 | 6.5% |
| p | 2216799 | 6.2% |
| d | 2186188 | 6.2% |
| f | 2182232 | 6.1% |
| U | 2133085 | 6.0% |
| r | 979534 | 2.8% |
| Other values (43) | 7253610 | 20.4% |